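Most machine learning (ML) methods assume that the data used in the training phase comes from the target population. In practice, however, one often faces dataset shift, which, if not properly taken into account, may decrease the predictive performance of ML models. In general, if practitioners know which type of shift is taking place (e.g., covariate shift or label shift), they can apply transfer learning methods to obtain better predictions. Unfortunately, current methods for detecting shift are either designed to detect only specific types of shift or cannot formally test for their presence. We introduce a general and unified framework that gives insights into how to improve prediction methods by detecting the presence of different types of shift and quantifying how strong they are. Our approach can be used for any data type (tabular/image/text) and for both classification and regression tasks. Moreover, it uses formal hypothesis tests that control false alarms. We illustrate how our framework is useful in practice using both artificial and real datasets, including an example of how the framework leads to insights that do improve the predictive power of a supervised model. Our package for dataset shift detection can be found at https://github.com/felipemaiapolo/detectshift.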
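To make the idea of formally testing for shift concrete, here is a minimal sketch of a permutation two-sample test on feature means. It is not the framework or the `detectshift` API described above; the function name `permutation_shift_test`, the choice of statistic, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def permutation_shift_test(source, target, n_permutations=999, seed=0):
    """Permutation two-sample test on the distance between feature means.

    Returns a p-value for the null hypothesis that `source` and `target`
    (arrays of shape (n_samples, n_features)) come from the same distribution.
    Hypothetical helper, not part of any published package.
    """
    rng = np.random.default_rng(seed)
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    pooled = np.vstack([source, target])
    n_source = len(source)

    def statistic(a, b):
        # Euclidean distance between the two sample means.
        return np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))

    observed = statistic(source, target)
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        a, b = pooled[perm[:n_source]], pooled[perm[n_source:]]
        if statistic(a, b) >= observed:
            exceed += 1
    # The +1 correction keeps the p-value strictly positive and the test valid.
    return (exceed + 1) / (n_permutations + 1)


# Synthetic example (assumed data): every feature of the target is shifted by 1.
rng = np.random.default_rng(42)
train_x = rng.normal(loc=0.0, size=(200, 3))
shifted_x = rng.normal(loc=1.0, size=(200, 3))
p_value = permutation_shift_test(train_x, shifted_x)
print(f"p-value under covariate shift: {p_value:.3f}")
```

Rejecting the null only when the p-value falls below a chosen level (say 0.05) is what bounds the false-alarm rate. A mean-based statistic only detects shifts that move the feature means, so a real diagnostic would need a richer statistic.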
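We dissect an experimental credit scoring model developed with real data and demonstrate, without access to protected attributes, how the use of location information introduces racial bias. We analyze the tree gradient boosting model with the aid of a game-theory-inspired machine learning explainability technique, counterfactual experiments, and Brazilian census data. By exposing algorithmic racial bias through an explanation of the trained model's inner mechanisms, this experiment constitutes an interesting artifact to aid the theoretical understanding of how racial bias emerges in machine learning systems. Without access to individuals' racial categories, we show how classification parity measures computed over geographically defined groups can carry information about a model's racial bias. The experiment testifies to the need for methods and language that do not presuppose access to protected attributes when auditing ML models, to the importance of considering regional specifics when addressing racial issues, and to the central role of census data in the AI research community. To the best of our knowledge, this is the first documented case of algorithmic racial bias in ML-based credit scoring in Brazil, the country with the second largest Black population in the world.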
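As a rough illustration of a classification parity measure computed over geographically defined groups, the sketch below compares approval rates between regions. The data, the region labels, and the helper names (`approval_rate_by_group`, `parity_gap`) are made up for this example, and the approval-rate gap is only one of several possible parity measures, not necessarily the one used in the study.

```python
from collections import defaultdict


def approval_rate_by_group(decisions, groups):
    """Share of positive decisions (1 = approved) within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {g: positives[g] / totals[g] for g in totals}


def parity_gap(rates):
    """Largest difference in approval rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy, made-up data: model decisions and a hypothetical region label per applicant.
decisions = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
regions = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

rates = approval_rate_by_group(decisions, regions)
print(rates, parity_gap(rates))
```

When regions differ in racial composition according to census data, a large gap between regions can signal a disparity worth auditing even though no individual's race is observed.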
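In supervised learning, training and test datasets are often sampled from distinct distributions, so domain adaptation techniques are required. Covariate shift adaptation yields good generalization performance when the domains differ only in the marginal distribution of the features. It is usually implemented using importance weighting, which, according to common wisdom, may fail because of small effective sample sizes (ESS). Previous research argues that this scenario is more common in high-dimensional settings. However, how effective sample size, dimensionality, and model performance/generalization are formally related in supervised learning, in the context of covariate shift adaptation, is still somewhat obscure in the literature. A main challenge is therefore to present a unified theory connecting these points. In this paper, we focus on building a unified view connecting ESS, data dimensionality, and generalization in the context of covariate shift adaptation. Moreover, we demonstrate how dimensionality reduction or feature selection can increase the ESS, and we argue that our results support dimensionality reduction before covariate shift adaptation as good practice.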
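The link between importance weights, ESS, and dimensionality can be illustrated with a small simulation. The sketch below assumes the commonly used Kish formula for ESS, $(\sum_i w_i)^2 / \sum_i w_i^2$, which may differ from the exact definition in the paper, and a synthetic Gaussian mean shift for which the density ratio is known in closed form. The function names and the shift size are illustrative assumptions.

```python
import numpy as np


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)


def gaussian_importance_weights(x, shift):
    """Density ratio N(shift * 1, I) / N(0, I) evaluated at the rows of x."""
    # log ratio = shift * sum_j x_j - d * shift^2 / 2 when every coordinate
    # of the mean is moved by `shift`.
    d = x.shape[1]
    log_w = shift * x.sum(axis=1) - 0.5 * d * shift ** 2
    return np.exp(log_w)


rng = np.random.default_rng(0)
n = 5000
ess_by_dim = {}
for d in (1, 5, 20):
    x_train = rng.normal(size=(n, d))  # training features ~ N(0, I)
    w = gaussian_importance_weights(x_train, shift=0.5)
    ess_by_dim[d] = effective_sample_size(w)
    print(f"d={d:2d}  ESS={ess_by_dim[d]:8.1f} of n={n}")
```

With the per-coordinate shift held fixed, the weights become more uneven as the dimension grows and the ESS drops, which is consistent with the suggestion that reducing dimensionality before reweighting can help.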
Neural Radiance Fields (NeRFs) are emerging as a ubiquitous scene representation that allows for novel view synthesis. Increasingly, NeRFs will be shareable with other people. Before sharing a NeRF, though, it might be desirable to remove personal information or unsightly objects. Such removal is not easily achieved with current NeRF editing frameworks. We propose a framework to remove objects from a NeRF representation created from an RGB-D sequence. Our NeRF inpainting method leverages recent work in 2D image inpainting and is guided by a user-provided mask. Our algorithm is underpinned by a confidence-based view selection procedure, which chooses which of the individual 2D inpainted images to use in the creation of the NeRF, so that the resulting inpainted NeRF is 3D consistent. We show that our method for NeRF editing is effective for synthesizing plausible inpaintings in a multi-view coherent manner. We validate our approach using a new and still-challenging dataset for the task of NeRF inpainting.
System identification, also known as learning forward models, transfer functions, or system dynamics, has a long tradition in both science and engineering across different fields. In particular, it is a recurring theme in Reinforcement Learning research, where forward models approximate the state transition function of a Markov Decision Process by learning a mapping from the current state and action to the next state. This problem is commonly framed directly as a Supervised Learning problem. This common approach faces several difficulties due to the inherent complexities of the dynamics to be learned, for example delayed effects, high non-linearity, non-stationarity, partial observability and, more importantly, error accumulation over long time horizons when using bootstrapped predictions (predictions based on past predictions). Here we explore the use of Reinforcement Learning for this problem. We elaborate on why and how this problem fits naturally and soundly as a Reinforcement Learning problem, and present some experimental results that demonstrate RL is a promising technique for solving this kind of problem.
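The error-accumulation difficulty mentioned above can be seen in a toy example. The sketch below is not the authors' method; it only contrasts one-step prediction error with the error of a bootstrapped rollout, using made-up scalar dynamics (`true_step`) and a deliberately slightly wrong forward model (`learned_step`).

```python
def rollout(step_fn, state, actions):
    """Feed each prediction back in as the next input (bootstrapped prediction)."""
    trajectory = []
    for action in actions:
        state = step_fn(state, action)
        trajectory.append(state)
    return trajectory


def true_step(state, action):
    # Hypothetical ground-truth dynamics, chosen only for illustration.
    return 1.02 * state + action


def learned_step(state, action):
    # Hypothetical forward model with a small error in one coefficient.
    return 1.03 * state + action


actions = [0.1] * 50
true_traj = rollout(true_step, 1.0, actions)
model_traj = rollout(learned_step, 1.0, actions)

# One-step error: the model always starts from the true previous state.
one_step_errors = [
    abs(learned_step(s, a) - true_step(s, a))
    for s, a in zip([1.0] + true_traj[:-1], actions)
]
# Multi-step error: the model starts from its own previous prediction.
rollout_errors = [abs(m - t) for m, t in zip(model_traj, true_traj)]

print(f"one-step error at final step: {one_step_errors[-1]:.3f}")
print(f"rollout error at final step:  {rollout_errors[-1]:.3f}")
```

A model that looks accurate under the one-step (supervised) criterion can still drift far from the true trajectory once its own predictions are fed back in, which is the gap that a long-horizon training objective would need to close.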
This chapter sheds light on the synaptic organization of the brain from the perspective of computational neuroscience. It provides an introductory overview on how to account for empirical data in mathematical models, implement them in software, and perform simulations reflecting experiments. This path is demonstrated with respect to four key aspects of synaptic signaling: the connectivity of brain networks, synaptic transmission, synaptic plasticity, and the heterogeneity across synapses. Each step and aspect of the modeling and simulation workflow comes with its own challenges and pitfalls, which are highlighted and addressed in detail.
Image generation and image completion are rapidly evolving fields, thanks to machine learning algorithms that are able to realistically replace missing pixels. However, generating large high-resolution images with a high level of detail presents important computational challenges. In this work, we formulate the image generation task as completion of an image where one out of three corners is missing. We then extend this approach to iteratively build larger images with the same level of detail. Our goal is to obtain a scalable methodology to generate high-resolution samples typically found in satellite imagery data sets. We introduce a conditional progressive Generative Adversarial Network (GAN) that generates the missing tile in an image, using as input three initial adjacent tiles encoded in a latent vector by a Wasserstein auto-encoder. We focus on a set of images used by the United Nations Satellite Centre (UNOSAT) to train flood detection tools, and validate the quality of synthetic images in a realistic setup.
We introduce the concepts of inverse solvability and security for a generic linear forward model and demonstrate how they can be applied to models used in federated learning. We provide examples of such models which differ in the resulting inverse solvability and security as defined in this paper. We also show how the large number of users participating in a given iteration of federated learning can be leveraged to increase both solvability and security. Finally, we discuss possible extensions of the presented concepts including the nonlinear case.
There is an increasing need in our society to achieve faster advances in science to tackle urgent problems such as climate change, environmental hazards, sustainable energy systems, and pandemics, among others. In certain domains like chemistry, scientific discovery carries the extra burden of assessing the risks of proposed novel solutions before moving to the experimental stage. Despite several recent advances in Machine Learning and AI to address some of these challenges, there is still a gap in technologies to support end-to-end discovery applications, integrating the myriad of available technologies into a coherent, orchestrated, yet flexible discovery process. Such applications need to handle complex knowledge management at scale, enabling knowledge consumption and production in a timely and efficient way for subject matter experts (SMEs). Furthermore, the discovery of novel functional materials strongly relies on the development of exploration strategies in the chemical space. For instance, generative models have gained attention within the scientific community due to their ability to generate enormous volumes of novel molecules across material domains. These models exhibit extreme creativity, which often translates into low viability of the generated candidates. In this work, we propose a workbench framework that aims at enabling human-AI co-creation to reduce the time until the first discovery and the opportunity costs involved. This framework relies on a knowledge base with domain and process knowledge, and on user-interaction components to acquire knowledge and advise the SMEs. Currently, the framework supports four main activities: generative modeling, dataset triage, molecule adjudication, and risk assessment.
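In the timeline-based approach to planning, which originated in the space sector, the evolution over time of a set of state variables (the timelines) is governed by a set of temporal constraints. Traditional timeline-based planning systems excel at integrating planning with execution by handling temporal uncertainty. To handle general nondeterminism as well, the concept of timeline-based games has recently been introduced. It has been proved that deciding whether a winning strategy exists for such games is 2EXPTIME-complete. However, a concrete approach to synthesizing controllers that implement such strategies has been missing. This paper fills this gap by outlining an approach to controller synthesis for timeline-based games.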